#####################################################################################################
### Replication Instructions For:                                                                 ###
### STRONG STATE WEAK ENFORCEMENT: BUREAUCRATIC FORBEARANCE OF CHINA'S SOCIAL INSURANCE POLICY.   ###
### POLITICAL SCIENCE RESEARCH AND METHODS                                                        ###
#####################################################################################################

For questions and information, please contact Hao Zhang <hao.z@nyu.edu> or Ye Zhang <ye_zhang@mit.edu>  
Last Updated: January 30, 2026



############################
### COMPUTE ENVIRONMENT ####
############################

# The analysis was performed in R 4.4.1 and Stata 14.0 MP on Windows 11 Home 64-bit.
# Please download all data in their ORIGINAL FORMATS for the replication code to work.
# The replication code assumes that the scripts and the data are in their designated folders.



#################################
####### REQUIRED PACKAGES #######
#################################

Packages for R 4.4.1

# readstata13 - version: 0.10.1
# data.table - version: 1.15.4
# tidyverse - version: 2.0.0
# stargazer - version: 5.2.3
# doBy - version: 4.6.22
# lfe - version: 3.0-0
# dplyr - version: 1.1.4
# DescTools - version: 0.99.55
# haven - version: 2.5.4
# readxl - version: 1.4.3
# mgcv - version: 1.9-1
# cenGAM - version: 0.5.3
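
A quick way to compare a local R installation against the versions listed above is sketched below. This is a convenience check, not part of the replication scripts; the helper name check_packages is hypothetical, and other package versions may well work too.

```r
# Versions copied from the list above.
required <- c(
  readstata13 = "0.10.1", data.table = "1.15.4", tidyverse = "2.0.0",
  stargazer   = "5.2.3",  doBy       = "4.6.22", lfe       = "3.0-0",
  dplyr       = "1.1.4",  DescTools  = "0.99.55", haven    = "2.5.4",
  readxl      = "1.4.3",  mgcv       = "1.9-1",  cenGAM    = "0.5.3"
)

# Hypothetical helper: report, for each listed package, whether the locally
# installed version matches, differs, or is missing altogether.
check_packages <- function(required,
                           installed = installed.packages()[, "Version"]) {
  found  <- unname(installed[names(required)])  # NA where not installed
  listed <- unname(required)
  data.frame(
    package   = names(required),
    listed    = listed,
    installed = found,
    status    = ifelse(is.na(found), "missing",
                       ifelse(found == listed, "match", "differs")),
    stringsAsFactors = FALSE
  )
}

print(check_packages(required))
```

Packages flagged "missing" can be installed with install.packages(); a "differs" status is only a warning that results may not be numerically identical.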


Packages for Stata 14.0

# shp2dta
# spmap



###############################################################################################
########### REPLICATING ALL RESULTS IN MAIN TEXT AND SUPPORTING INFORMATION  ##################
###############################################################################################

# Please download the replication folder before running any code. All R scripts assume that the initial working directory is the source file location. To replicate, first run the R scripts "Table 1.R" and "Table 2.R" in the "main text" folder; they generate "firm analysis.RData" and "individual analysis.RData" in the "analysis data" folder. The remaining R scripts in the "main text" and "appendix" folders can then be run in any order.
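
The run order described above can be sketched as a small driver. This is an illustrative sketch only: the helper names order_scripts and run_all are hypothetical, and it assumes that setting the working directory to each script's folder is enough to mimic "source file location". If a script sets its own working directory (for example through RStudio), run it interactively instead.

```r
# Hypothetical helper: put the two scripts that build the analysis data
# first (in the given order) and keep all other scripts in their listed order.
order_scripts <- function(scripts, first = c("Table 1.R", "Table 2.R")) {
  is_first <- basename(scripts) %in% first
  head <- scripts[is_first]
  head <- head[order(match(basename(head), first))]
  c(head, scripts[!is_first])
}

# Hypothetical driver: source every R script in the "main text" and
# "appendix" folders, using each script's own folder as working directory.
run_all <- function(root = ".") {
  scripts <- list.files(file.path(root, c("main text", "appendix")),
                        pattern = "\\.R$", full.names = TRUE)
  for (s in order_scripts(scripts)) {
    message("Running ", s)
    old <- setwd(dirname(s))
    # Restore the previous working directory even if the script fails.
    tryCatch(source(basename(s), local = new.env()),
             finally = setwd(old))
  }
}

# Usage (not run here): run_all("path/to/replication folder")
```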

# To replicate Figure 1, run "china map preprocessing.R" first and then run "maps.do".   

# Note that certain data files are restricted due to legal provisions. However, all restricted raw data can be obtained from the data sources specified in the last section of this file (RESTRICTED DATASETS).



############################################################################
########################### LIST OF FILES ##################################
############################################################################


---- RAW DATA ----
[1] "cfps 2012-2017.RData"                       (restricted, CFPS) CFPS survey data with city linkages (2012-2017)       
[2] "cies 1998-2007.dta"                         (restricted, CIES) CIES firm survey data (1998-2007)
[3] "cies name 1998-2007.dta"                    (restricted, CIES) CIES firm id to name linkage (1998-2007)
[4] "city mayor 1990-2015.dta"                   (restricted, CGOD) CGOD mayor data (1990-2015)
[5] "city panel 1995-2010.dta"                   city panel covariates (1995-2010)
[6] "city panel 1995-2011.dta"                   city panel covariates (1995-2011)
[7] "city panel 1996-2014.xlsx"                  city panel covariates (1996-2014)
[8] "city panel 2010-2019.xlsx"                  city panel covariates (2010-2019)
[9] "city secretary 1990-2015.dta"               (restricted, CGOD) CGOD party secretary data (1990-2015)
[10] "city spatial distance.dta"                 Chinese city spatial distance data
[11] "citycode.xlsx"                             citycode crosswalk
[12] "citylist encoding.xlsx"                    citycode crosswalk for strikemap data
[13] "county panel 1997-2018.csv"                county panel covariates (1997-2018)
[14] "strikemap 20210717-updated.xlsx"           strikemap data (20210717)
[15] "synthetic data"                            synthetic data

# NOTE: City panel covariates come from the China City Statistical Yearbook. Due to the data structure and availability, they are split into four files covering different covariates and time periods. County panel covariates come from the China County Statistical Yearbook. All datasets posted here are for journal verification purposes only. PLEASE access and cite the original data sources.

The synthetic datasets are provided to illustrate the structure, variable types, and coding of the original restricted data while complying with confidentiality and data-use agreements. Synthetic datasets were generated by drawing each value with replacement from the pool of observed values in the corresponding variable. All direct identifiers (e.g., names, ID codes, geographic labels) have been removed and replaced with generic integer labels (1, 2, 3, …). We do not provide a synthetic version of "cies name 1998-2007.dta" because it contains only two linking variables (unique firm ID and real firm names).
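
The resampling procedure described above can be illustrated with a minimal sketch. This is not the authors' actual script: the function name make_synthetic, the toy data, and the way identifiers are mapped to integers (one label per distinct value, in order of first appearance) are assumptions made for illustration.

```r
# Hypothetical helper: build a synthetic copy of a data frame.
# - Columns in id_cols are replaced by generic integer labels (1, 2, 3, ...).
# - Every other column is resampled with replacement from its own observed
#   values, independently of the other columns.
make_synthetic <- function(df, id_cols = character(0), seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n   <- nrow(df)
  out <- df
  for (v in names(df)) {
    if (v %in% id_cols) {
      out[[v]] <- match(df[[v]], unique(df[[v]]))
    } else {
      out[[v]] <- df[[v]][sample.int(n, n, replace = TRUE)]
    }
  }
  out
}

# Toy data (made up for this example only).
toy <- data.frame(
  firm_name = c("A", "B", "A", "C"),
  wage      = c(10, 20, 30, 40),
  soe       = c(TRUE, FALSE, FALSE, TRUE)
)
syn <- make_synthetic(toy, id_cols = "firm_name", seed = 1)
str(syn)
```

Because each column is drawn independently, a dataset built this way keeps the variable names, types, and value ranges but not the relationships between variables, so it is suited to checking that code runs rather than to reproducing estimates.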


---- ANALYSIS DATA ----
 [1] "firm analysis.RData"                       (restricted, CIES+CGOD) dataset for firm-level analysis
 [2] "individual analysis.RData"                 (restricted, CFPS+CGOD) dataset for individual-level analysis


---- MAIN ANALYSIS ----
 [1] "Figure 1"             
 [2] "Figure 2.R"  
 [3] "Figure 3.R"
 [4] "Table 1.R"             
 [5] "Table 2.R"  


---- Appendix ----
[1] "A1.R"   
[2] "A2.R"   
[3] "A3.R"   
[4] "B1.R"   
[5] "B10.R"  
[6] "B11.R"  
[7] "B12.R"  
[8] "B13.R"  
[9] "B14.R" 
[10] "B2.R"   
[11] "B3.R"   
[12] "B4.R"   
[13] "B5.R"   
[14] "B6.R"   
[15] "B7.R"   
[16] "B8.R"   
[17] "B9.R"   
[18] "C1.R"  
[19] "C2.1.R" 
[20] "C2.2.R" 
[21] "C3.R"   

---- Others ----
<README.txt>				ReadMe  



###############################################################################################
################################ RESTRICTED DATASETS ##########################################
###############################################################################################

The firm census data, the China Industrial Enterprise Survey (CIES), that support the findings of this study are available through China’s National Bureau of Statistics and its authorized data sellers (e.g., certain universities). The individual survey data, the China Family Panel Studies (CFPS), are collected and provided by the Institute of Social Science Survey (ISSS) at Peking University and are available at https://www.isss.pku.edu.cn/cfps/en/ with further permission from ISSS. Neither dataset is publicly available due to government and university regulations. Finally, the mayor and party secretary data can be accessed by purchasing a yearly subscription to the "China Government Official Data" (CGOD) from the Chinese Research Data Services Platform developed by Shanghai Jing He Co., Ltd. at https://www.cnrds.com/Home/Login#/.